Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: fix 'select for update' on partitioned table again #30732

Closed
wants to merge 5 commits into from

Conversation

tiancaiamao
Copy link
Contributor

What problem does this PR solve?

Issue Number: close #30382 #28073

Problem Summary:

This is a rework of #21148, that old fix introduce a extra partition ID column to the schema.
In theory, the performance is better, but it's error-prone. After PR 21148 it has introduced numerous bugs.
For example, #28073 #28292 #30489 ... and maybe many more

If we use this one, #28666 is unnecessary. It's a fix on the old code, while this one is a totally rework.

What is changed and how it works?

Now I avoid using the extra partition ID column.
The problem of partition ID column is, many of the executor need to fill the extra partition ID column, and if we forget to do it, then there would be bugs:

  • When the schema is wrong, column and row datatum mismatch
  • When the executor forget to fill the extra column, the extraPIDColumn will be missing

There are so many places involved, it's hard to fix all of them.

In this PR, I just avoid column pruning on the SelectLock, and the using partition pruning again to get the partition ID.

For #30382, the select for update lock will not lock the table of the subquery, although the plan is transformed into Join, just one side is locked.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Fix many bugs related to 'select ... for update' on partitioned table

@ti-chi-bot
Copy link
Member

[REVIEW NOTIFICATION]

This pull request has not been approved.

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Dec 15, 2021
@tiancaiamao tiancaiamao added the type/bugfix This PR fixes a bug. label Dec 15, 2021
cols := pt.VisibleCols()
offsets := make([]int, 0, len(cols))
for _, colInfo := range cols {
colName := fmt.Sprintf("%s.%s.%s", dbInfo.Name.L, tblInfo.Name.L, colInfo.Name.L)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will there be some colume names not in this format like _tidb_rowid that it could not be found in the schema?

Copy link
Contributor

@mjonss mjonss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some curious questions :)


for _, pt := range v.PartitionedTable {
tblInfo := pt.Meta()
e.partitionedTable[tblInfo.ID] = pt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the tblInfo.ID the logical table or the physical table (partition) ID here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tblInfo.ID is the logical table ID here


cols := pt.VisibleCols()
offsets := make([]int, 0, len(cols))
for _, colInfo := range cols {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why create this double loop for each partition?
Are not all the partitions the same?
I wonder if we cannot use the HandleCols from tblID2Handle instead.
I think I managed something like that in my hack here.

}
return selectLock, nil
}

func addExtraPIDColumnToDataSource(p LogicalPlan, info *extraPIDInfo) error {
func collectPartitionTable(p LogicalPlan, input []table.PartitionedTable) []table.PartitionedTable {
switch raw := p.(type) {
case *DataSource:
// Fix issue 26250, do not add extra pid column to normal table.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove this comment.

e.tblID2PIDColumnIndex[tblID] = offset
if len(v.PartitionedTable) > 0 {
e.partitionedTable = make(map[int64]table.PartitionedTable)
e.tblID2PtColsOffsets = make(map[int64][]int)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this? Is not the e.tblID2Handle enough?

if err != nil {
return err
}
input = collectPartitionTable(child, input)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you still collect the partition table of the aggregation?

Copy link
Member

@winoros winoros Dec 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And also, you don't need to collect the things after a subquery.
For the two following queries

select * from t, (select * from t1) t1 where t.c=t1.id for update;
select * from t, t1 where t.c=t1.id for update;

The first would not lock the t1, only the second one would.

Maybe you can refer to the handleHelper of the PlanBuilder. It maintains a stack to record the information. Though it does some redundant work. But it's easy to add new things to it I think.

	// handleHelper records the handle column position for tables. Delete/Update/SelectLock/UnionScan may need this information.
	// It collects the information by the following procedure:
	//   Since we build the plan tree from bottom to top, we maintain a stack to record the current handle information.
	//   If it's a dataSource/tableDual node, we create a new map.
	//   If it's an aggregation, we pop the map and push a nil map since no handle information left.
	//   If it's a union, we pop all children's and push a nil map.
	//   If it's a join, we pop its children out then merge them and push the new map to stack.
	//   If we meet a subquery, it's clear that it's an independent problem so we just pop one map out when we finish building the subquery.
	handleHelper *handleColHelper

@ti-chi-bot
Copy link
Member

@tiancaiamao: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 16, 2021
@tiancaiamao
Copy link
Contributor Author

Use another strategy to fix the locking on partitioned table issue #31634
Close this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. type/bugfix This PR fixes a bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TiDB panicked with index out of range [-1]
5 participants